
Conversation

@qti-yuduo
Contributor

Motivation:

QNN HTP was rejecting quantized BatchNorm models whose parameters (scale, mean, var) arrive through DequantizeLinear nodes with per-channel INT8 quantization. This pattern is common in models produced by quantization tools.

Changes:

  • Helpers to resolve BatchNorm params through DQ nodes to their underlying initializers
  • Support per-channel dequantization for BatchNorm parameters
  • Support the UFIXED_POINT_16 input datatype
  • Add unit test covering this QDQ params configuration

@tianleiwu
Contributor

From AI

Summary

This PR enhances the QNN EP to support BatchNorm operators where the parameters (scale, mean, var) are supplied via DequantizeLinear (DQ) nodes with per-channel INT8 quantization. Previously, these configurations were rejected by the HTP backend.

Key Changes

  • Parameter Resolution: Logic added to batch_norm_op_builder.cc to correctly trace back through DQ nodes to find the underlying initializers for batch norm parameters.
  • Quantization Support: Explicit support added for per-channel dequantization and UFIXED_POINT_16 data types.
  • Testing: Added a dedicated unit test BatchNormOpTests covering this specific QDQ configuration.
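The trace-back step can be pictured with a minimal sketch; the types and names below are hypothetical stand-ins, not the actual graph structures in batch_norm_op_builder.cc:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for graph bookkeeping: a DQ producer records the
// name of the quantized initializer feeding its input 0.
struct DQProducer {
  std::string quantized_initializer;
};

// If a BatchNorm parameter input is fed by a DequantizeLinear node, follow
// it back to the underlying initializer; otherwise the input name already
// refers to a plain initializer and is used as-is.
std::string ResolveParamInitializer(
    const std::string& input_name,
    const std::unordered_map<std::string, DQProducer>& dq_producers) {
  auto it = dq_producers.find(input_name);
  return it != dq_producers.end() ? it->second.quantized_initializer
                                  : input_name;
}
```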

Review Analysis

Correctness

  • Gap Fill: This addresses a specific limitation where valid quantized models (likely from tools like Olive or AMCT) were failing on QNN HTP.
  • Logic: The code to unwrap the DQ params seems consistent with how other operators handle QDQ inputs.
  • Selectors: Updates to qdq_selectors.cc ensure the optimizer doesn't prematurely fuse or reject these nodes incorrectly.

Performance

  • Hardware Acceleration: Enabling these nodes to run on the HTP (Hexagon Tensor Processor) instead of falling back to CPU (or failing) is a significant performance enabler for quantized models.

Conclusion

This is a necessary fix for broad support of quantized models on Qualcomm hardware. The implementation includes necessary builder logic and verification tests. LGTM.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds support for quantized BatchNormalization with per-channel DequantizeLinear parameters on the QNN HTP backend, which is a common pattern in quantized models from quantization tools.

Changes:

  • Refactored BatchNorm parameter preprocessing to support per-channel quantization through a new MaybeDequantizeParamTensor helper
  • Added support for UFIXED_POINT_16 and SFIXED_POINT_16 datatypes in BatchNorm operations
  • Updated QDQ node group selector to accept 3-5 DQ nodes (previously required exactly 3)
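A minimal sketch of the relaxed count check, assuming the five BatchNormalization inputs (X, scale, B, mean, var) may each have a DQ producer, with at least three required (the function name is illustrative, not the actual selector code):

```cpp
#include <cassert>

// BatchNormalization has five inputs (X, scale, B, mean, var). The relaxed
// selector accepts any count of DQ producers from 3 (the previous fixed
// requirement) up to 5 (all parameters quantized).
bool IsSupportedDqCount(int num_dq_inputs) {
  return num_dq_inputs >= 3 && num_dq_inputs <= 5;
}
```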

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Reviewed files:

  • onnxruntime/test/providers/qnn/batch_norm_test.cc: Added test case for BatchNorm with per-channel quantized parameters (scale, mean, var)
  • onnxruntime/core/providers/qnn/builder/opbuilder/batch_norm_op_builder.cc: Implemented per-channel dequantization support, added 16-bit datatype support, and refactored parameter resolution logic
  • onnxruntime/core/optimizer/qdq_transformer/selectors_actions/qdq_selectors.cc: Updated the selector to allow a variable number of DQ nodes (3-5) for BatchNorm inputs


@yuslepukhin yuslepukhin added the ep:QNN issues related to QNN execution provider label Jan 13, 2026
Contributor

@adrianlizarraga adrianlizarraga left a comment


couple questions

@qti-yuduo qti-yuduo force-pushed the dev/yuduo/qdq-bn-cl branch from 86f3192 to 3b58106 Compare January 15, 2026 01:00
@qti-yuduo qti-yuduo requested a review from edgchen1 January 15, 2026 03:14
@adrianlizarraga
Contributor

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

Member

@yuslepukhin yuslepukhin left a comment


:shipit:

@yuslepukhin
Member

/azp run MacOS CI Pipeline

@yuslepukhin yuslepukhin enabled auto-merge (squash) January 15, 2026 22:07
@yuslepukhin yuslepukhin merged commit bbd3850 into microsoft:main Jan 16, 2026
99 of 108 checks passed

Labels

ep:QNN issues related to QNN execution provider release:1.24.0
